AITopics | composable diffusion

composable diffusion

Any-to-Any Generation via Composable Diffusion

Neural Information Processing Systems | Dec-24-2025, 13:06:11 GMT

Unlike existing generative AI systems, CoDi can generate multiple modalities in parallel and its input is not limited to a subset of modalities like text or image. Despite the absence of training datasets for many combinations of modalities, we propose to align modalities in both the input and output space. This allows CoDi to freely condition on any input combination and generate any group of modalities, even if they are not present in the training data. CoDi employs a novel composable generation strategy which involves building a shared multimodal space by bridging alignment in the diffusion process, enabling the synchronized generation of intertwined modalities, such as temporally aligned video and audio. Highly customizable and flexible, CoDi achieves strong joint-modality generation quality, and outperforms or is on par with the unimodal state-of-the-art for single-modality synthesis.

any-to-any generation, composable diffusion, modality, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.40)

CO3: Contrasting Concepts Compose Better

Dutta, Debottam, Chen, Jianchong, Rajagopalan, Rajalaxmi, Wei, Yu-Lin, Choudhury, Romit Roy

arXiv.org Artificial Intelligence | Oct-1-2025

We propose to improve multi-concept prompt fidelity in text-to-image diffusion models. We begin with common failure cases: prompts like "a cat and a dog" that sometimes yield images where one concept is missing, faint, or colliding awkwardly with another. We hypothesize that this happens when the diffusion model drifts into mixed modes that over-emphasize a single concept it learned strongly during training. Instead of re-training, we introduce a corrective sampling strategy that steers away from regions where the joint prompt behavior overlaps too strongly with any single concept in the prompt. The goal is to steer towards "pure" joint modes where all concepts can coexist with balanced visual presence. We further show that existing multi-concept guidance schemes can operate in unstable weight regimes that amplify imbalance; we characterize favorable regions and adapt sampling to remain within them. Our approach, CO3, is plug-and-play, requires no model tuning, and complements standard classifier-free guidance. Experiments on diverse multi-concept prompts indicate improvements in concept coverage, balance, and robustness, with fewer dropped or distorted concepts compared to standard baselines and prior compositional methods. Results suggest that lightweight corrective guidance can substantially mitigate brittle semantic alignment behavior in modern diffusion systems.
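For orientation, the abstract above refers to multi-concept guidance schemes and their weights. The sketch below illustrates the widely used weighted combination of per-concept noise predictions around an unconditional prediction, which reduces to standard classifier-free guidance when there is a single concept. It is not the CO3 corrective sampler; the function name `compose_guidance` and the plain-list representation of noise predictions are assumptions made for illustration only.

```python
from typing import List, Sequence


def compose_guidance(
    eps_uncond: Sequence[float],
    eps_conds: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Combine noise predictions as eps_uncond + sum_i w_i * (eps_i - eps_uncond).

    Hypothetical helper: real samplers operate on tensors, not Python lists.
    """
    if len(eps_conds) != len(weights):
        raise ValueError("need exactly one weight per concept")
    combined = list(eps_uncond)
    for eps_c, w in zip(eps_conds, weights):
        if len(eps_c) != len(eps_uncond):
            raise ValueError("all predictions must have the same length")
        for k, (c, u) in enumerate(zip(eps_c, eps_uncond)):
            # Each concept pushes the prediction away from the unconditional one.
            combined[k] += w * (c - u)
    return combined


# Two concepts ("cat", "dog") with unequal weights, toy 2-dimensional predictions.
eps = compose_guidance([0.0, 1.0], [[1.0, 1.0], [0.0, 3.0]], [2.0, 0.5])
print(eps)
```

In this form, a large weight on one concept lets its direction dominate the combined prediction, which is one plausible reading of the imbalance the abstract describes.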

diffusion model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2509.2594

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Improving Compositional Generation with Diffusion Models Using Lift Scores

Yu, Chenning, Gao, Sicun

arXiv.org Artificial Intelligence | May-27-2025

We introduce a novel resampling criterion based on lift scores for improving compositional generation in diffusion models. By leveraging the lift scores, we evaluate whether generated samples align with each single condition and then compose the results to determine whether the composed prompt is satisfied. Our key insight is that lift scores can be efficiently approximated using only the original diffusion model, requiring no additional training or external modules. We develop an optimized variant that achieves relatively lower computational overhead during inference while maintaining effectiveness. Through extensive experiments, we demonstrate that lift scores significantly improve condition alignment for compositional generation across 2D synthetic data, CLEVR position tasks, and text-to-image synthesis. Our code is available at http://rainorangelemon.github.io/complift.
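The resampling idea in the abstract above can be sketched as a simple filter. Lift, in its usual sense, compares p(x | c) with p(x), so a positive log-lift means the condition makes the sample more likely. The sketch assumes, as an illustration that the abstract does not confirm, that log-lift is estimated by how much conditioning reduces the model's squared denoising error, averaged over several noise draws. The names `lift_estimate`, `satisfies_all`, and `accepted_indices` are hypothetical, and the error values are made-up inputs rather than results.

```python
from statistics import fmean
from typing import List, Sequence


def lift_estimate(uncond_errors: Sequence[float], cond_errors: Sequence[float]) -> float:
    """Mean reduction in denoising error when the condition is supplied.

    Assumption: a positive value is read as the sample agreeing with the condition.
    """
    if not uncond_errors or len(uncond_errors) != len(cond_errors):
        raise ValueError("need matching, non-empty error lists")
    return fmean([u - c for u, c in zip(uncond_errors, cond_errors)])


def satisfies_all(lifts: Sequence[float], threshold: float = 0.0) -> bool:
    """Conjunction: every single condition must have lift above the threshold."""
    return all(score > threshold for score in lifts)


def accepted_indices(per_sample_lifts: Sequence[Sequence[float]]) -> List[int]:
    """Indices of samples kept by the resampling filter."""
    return [i for i, lifts in enumerate(per_sample_lifts) if satisfies_all(lifts)]


# Three candidate samples, each scored against two conditions (toy numbers).
scores = [
    [lift_estimate([1.0, 2.0], [0.5, 1.0]), 0.2],  # both conditions satisfied
    [0.4, -0.1],                                   # second condition fails
    [-0.3, 0.6],                                   # first condition fails
]
print(accepted_indices(scores))
```

Scoring each condition separately and then composing the decisions mirrors the two-step description in the abstract; other compositions (for example, disjunction via `any`) would follow the same pattern.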

artificial intelligence, complift, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2505.1374

Country: North America (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

Any-to-Any Generation via Composable Diffusion

Neural Information Processing Systems | Oct-11-2024, 02:35:22 GMT

any-to-any generation, composable diffusion, modality, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.44)

MIT AI Image Generator System Makes Models Like DALL-E 2 More Creative

#artificialintelligence | Sep-25-2022, 07:20:18 GMT

A sample DALL·E 2 generated image of "an astronaut riding a horse in a photorealistic style." A new method developed by researchers uses multiple models to create more complex images with better understanding. With the introduction of DALL-E, the internet had a collective feel-good moment. This artificial intelligence-based image generator is inspired by artist Salvador Dali and the lovable robot WALL-E and uses natural language to produce whatever mysterious and beautiful image your heart desires. Seeing typed-out inputs such as "smiling gopher holding an ice cream cone" instantly spring to life as a vivid AI-generated image clearly resonated with the world.

dall-e 2, diffusion model, image generator system make model, (11 more...)

#artificialintelligence

Country: North America > United States > Illinois (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

AI system makes models like DALL-E 2 more creative

#artificialintelligence | Sep-8-2022, 13:06:59 GMT

The internet had a collective feel-good moment with the introduction of DALL-E, an artificial intelligence-based image generator inspired by artist Salvador Dali and the lovable robot WALL-E that uses natural language to produce whatever mysterious and beautiful image your heart desires. Seeing typed-out inputs like "smiling gopher holding an ice cream cone" instantly spring to life clearly resonated with the world. Getting said smiling gopher and attributes to pop up on your screen is not a small task. DALL-E 2 uses something called a diffusion model, where it tries to encode the entire text into one description to generate an image. But once the text has a lot more details, it's hard for a single description to capture it all.

ai system make model, dall-e 2, diffusion model, (14 more...)

#artificialintelligence

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.40)
North America > United States > Illinois (0.05)

Industry: Automobiles & Trucks (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)
